#lang scribble/manual
@; -*- mode: Scribble -*-
@;
@; Copyright (c) 2019-2020 The University of Utah
@; All rights reserved.
@;
@; This file is part of Xsmith, a generator of highly effective fuzz testers.
@;
@; Redistribution and use in source and binary forms, with or without
@; modification, are permitted provided that the following conditions are met:
@;
@;   * Redistributions of source code must retain the above copyright notice,
@;     this list of conditions and the following disclaimer.
@;
@;   * Redistributions in binary form must reproduce the above copyright
@;     notice, this list of conditions and the following disclaimer in the
@;     documentation and/or other materials provided with the distribution.
@;
@; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
@; AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
@; IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
@; ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
@; LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
@; CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
@; SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
@; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
@; CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
@; ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
@; POSSIBILITY OF SUCH DAMAGE.

@;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

@(require
"util.rkt"
(for-label
 (except-in clotho/racket/base
            module
            string
            #%app)
 (prefix-in racket/base: racket/base)
 xsmith
 xsmith/racr-convenience
 xsmith/canned-components
 xsmith/app

 racket/contract/base
 racket/dict

 @; if racr had scribble documentation this would provide hyperlinks
 @;racr
 )
)

@title{Xsmith Reference}

@section{Stability}

Xsmith does not currently promise API stability between versions.

@section[#:tag "generated-rules"]{Auto-Generated attributes and choice-methods}
Many attributes and choice-methods are auto-generated by Xsmith:

@bold{attributes}

@itemlist[
@item{@racket['xsmith_is-hole?]

Accepts no arguments other than the node to call it on.
Returns @racket[#t] when called on a hole node, otherwise returns @racket[#f].

Example:
@racketblock[
(code:comment "Within the context of a choice method, this will return #t")
(att-value 'xsmith_is-hole? (current-hole))
]
}
@item{@racket['xsmith_render-node]

Accepts no arguments other than the node to call it on.
Defined by the @racket[render-node-info] property.

Example:
@racketblock[
(add-property
 my-spec-component
 render-node-info
 (code:comment "assuming we're outputting for a lispy language")
 [AdditionExpression (λ (n) `(+ ,(att-value 'xsmith_render-node (ast-child 'l n))
                                ,(att-value 'xsmith_render-node (ast-child 'r n))))])
]
}
@item{@racket['xsmith_find-descendants]

Accepts the node to call it on, then a predicate.
Returns a list of all descendant nodes that satisfy the predicate.

Example:
@racketblock[
(code:comment "This will return a list of all addition and subtraction nodes.")
(code:comment "It will not return any nodes that are subtypes, though!")
(att-value 'xsmith_find-descendants ast-node (λ (n) (member (ast-node-type n)
                                                            '(AdditionExpression
                                                              SubtractionExpression))))
]
}
@item{@racket['xsmith_find-a-descendant]

Accepts the node to call it on, then a predicate.
Like @racket['xsmith_find-descendants], but it only returns one.
If no descendant matching the predicate is found, it returns @racket[#f].
It is mostly useful if you just want to know if there is any descendant matching a type, such as to determine whether to do an analysis based on the presence or absence of some feature.
}


@item{@racket['xsmith_binding]
Accepts the node to call it on and optionally a boolean @verb{require-binder-or-reference} flag (defaults to @racket[#t]).
If the given node is a reference, as determined by the @racket[reference-info] property, then the reference is resolved and a @racket[binding] object is returned.
If the given node is a binder, as determined by the @racket[binder-info] property, then its @racket[binding] object is returned.
For any other node, @racket[#f] is returned when @verb{require-binder-or-reference} is false, otherwise an error is raised.

Example:
@racketblock[
(code:comment "This raises an exception if `n` is not a binder or reference.")
(att-value 'xsmith_binding n)
(code:comment "If `n` is not a binder or a reference, return #f.")
(att-value 'xsmith_binding n #f)
]

The main use of this is to do analyses where you need to look up the declaration node that a reference refers to.
The @racket[binding?] object returned by this function contains a reference to the binder node.
When resolving references, the @seclink["scope-graph"]{scope graph model} is used.
}

@item{@racket['xsmith_ast-depth]

Accepts the node it is called on, and no other arguments.
Returns the tree depth at that node.
Determined by @racket[depth-increase].

Example:
@racketblock[
(att-value 'xsmith_ast-depth n)
]
}
@item{@racket['xsmith_type]
This attribute takes no arguments (besides a @(racr) node) and returns the type (@racket[type?]) of that node.
Note that the type of the node may be a not-yet-concretized type variable.

@racketblock[
(att-value 'xsmith_type n)
]
}
]

@bold{choice-methods}

@itemlist[
@item{@racket['xsmith_get-reference!]
This is a choice method that can be used when creating a reference node.
It uses @racket[reference-choice-info] to choose a reference.
The default value of @racket[reference-choice-info] will cause a fresh appropriate definition to be lifted if none exists.
If no reference can be made (due to a non-default @racket[reference-choice-info]), @racket[#f] will be returned.
Otherwise, it returns a @racket[binding?] of the reference, which you can use @racket[binding-name] on.

Example:
@racketblock[
(att-value 'xsmith_type (current-hole)) (code:comment "-> (fresh-type-variable int bool)")

(code:comment "This will return the name of a variable in scope")
(code:comment "that has type int or type bool.")
(code:comment "It may make a new definition node to do that.")
(define name (binding-name (send this xsmith_get-reference!)))
]

Note that the reference is @emph{choosing} one of the possible types (via @racket[concretize-type]).
Probably you are using this in a @racket[fresh] method to initialize a node.
So you will want to put that name in the node.

Generally you don't need to do this manually, however.
Nodes with the @racket[reference-info] property marked will automatically have their name fields initialized, and nodes with the @racket[binder-info] property marked will automatically have their name and type fields initialized.
}

@item{@racket['xsmith_get-reference-for-child!]
This is a choice method that can be used when creating reference nodes for children during the @racket[fresh] rule.
It takes two additional parameters:
@itemlist[
@item{@italic{type}: @racket[settled-type?] - the type you want the reference to be.  Note that it must be a settled type.  If you want this type to be based somehow on the type of the node in question, use @racket[force-type-exploration-for-node!] so the node's type will be maximally concretely computed before you concretize it.}
@item{@italic{write-reference?}: @racket[boolean?] - whether the reference will be used as a write reference.}
]

It returns a @racket[binding?] of the reference or @racket[#f] if none can be made (due to a non-default @racket[reference-choice-info]).

Example:
@racketblock[
(code:comment "This will return the name of a variable in scope")
(code:comment "with type int for a write reference")
(define name (binding-name (send this xsmith_get-reference-for-child! int #t)))
]

Like @rule[xsmith_get-reference!], it may cause a new definition to be lifted.
Only use the result for children of the @racket[current-hole].
}
]


Node type names, attribute names, and choice-method names are just symbols, so they are not hygienic.
The names of symbols used or defined by the xsmith library start with @verb{xsmith} or @verb{_xsmith}, so don't tread on those namespaces.


@section{Forms for Defining a Grammar and Its Attributes}

@defform[(define-spec-component component-name)]{
Defines a spec component.  Spec components include information about a language grammar and attributes, and can be combined to generate an xsmith fuzzer.  You add grammar productions with @racket[add-to-grammar], you add properties with @racket[add-property], and you can add attributes and choice-methods with @racket[add-attribute] and @racket[add-choice-method], respectively.  Spec components are combined with @racket[define-xsmith-interface-functions].

Example:
@racketblock[
(define-spec-component my-spec-component)
(code:comment "Now use add-to-grammar, add-property, etc.")
(code:comment "Then use my-spec-component in assemble-part-specs.")
]
}

@defform[(define-xsmith-interface-functions
[spec-component ...]
option ...)
#:grammar
[(option (code:line #:fuzzer-name fuzzer-name:id)
         (code:line #:fuzzer-version fuzzer-version-string)
         (code:line #:program-node program-node:id)
         (code:line #:properties [property ...])
         (code:line #:command-line-name command-line-name:id)
         (code:line #:function-name function-name:id)
         (code:line #:comment-wrap comment-wrap-function)
         (code:line #:format-render format-render-function)
         (code:line #:default-max-depth default-max-depth)
         (code:line #:default-type-max-depth default-type-max-depth)
         (code:line #:type-thunks type-thunks)
         (code:line #:features ([feature-name:id feature-default-value:boolean optional-feature-docstring:string] ...))
         (code:line #:extra-parameters ([parameter-name:id parameter-docstring:string parameter-expression param-converter])))
 ]]{
Combines and compiles spec components into a generator.
Defines a function to parse a command line and generate programs.

The fuzzer name defaults to the name of the first spec component provided.
The command-line function is named the id passed to @racket[#:command-line-name], defaulting to @tt{<fuzzer-name>-command-line} with @tt{<fuzzer-name>} replaced appropriately.
The command-line function takes no arguments and parses @racket[current-command-line-arguments].

@racket[#:comment-wrap] takes a list of strings which contain info about the generated program, such as the command line used to generate it, the @racket[#:fuzzer-name], the @racket[#:fuzzer-version], and the random seed number.  It should return a string representing those lines commented out.  Such as the following, assuming the "#" character is the line-comment character in your language:

@racketblock[
(λ (lines)
  (string-join
   (map (λ (x) (format "# ~a" x)) lines)
   "\n"))]

@racket[fuzzer-version] takes a string describing the version of the current fuzzer, used in automatically-generated output.

@racket[#:format-render] is a function which takes the output of your @racket[render-node-info] property as input and should return a string representing the program (perhaps by use of a pretty-printing function).  If your @racket[render-node-info] property produces a string already, you will not need to specify the @racket[#:format-render] argument.

The @racket[#:default-max-depth] sets the “max” tree depth generated.
It's a bit of a fib -- there are situations where Xsmith will allow deeper generation, but they are limited.
Similarly the @racket[#:default-type-max-depth] sets the maximum depth that will be used for type concretization.
In other words, when a type variable is concretized it will only choose base types past that depth.
However, deeper types can still be made through other means.

The @racket[#:type-thunks] argument should be a list or thunk that returns a list of thunks that return a type.
IE @tt{(listof (-> type?))} or @tt{(-> (listof (-> type?)))}.
However, @racket[#f] may be returned instead of a thunk or instead of a type to filter out a given type in some situations.

The @racket[#:features] argument takes a list of lists containing a feature name (as an identifier) and a default value (as a boolean), and optionally a list of documentation strings.
Each feature will be included in the command-line options as @verb{--with-<feature-name>}.
Documentation strings will be displayed in the @verb{--help} text, one per line.
The values of these features is available via @racket[xsmith-feature-enabled?].


@racket[#:extra-parameters] is a list of specifications for extra custom command line parameters.
Each list contains an identifier to be used as the name of the parameter (eg. for use in the command-line interface), a documentation string, a parameter, and a normalization function (or @racket[#f]).
The normalization function should take the string given on the command line and convert it to the value that is actually parameterized during program generation.
The following example defines a @racket["--max-widgets"] parameter with a default of 3:

@racketblock[(define widget-parameter
               (make-parameter 3))

             ...

             (define-xsmith-interface-functions
               ...
               #:extra-parameters ([max-widgets
                                    "The maximum number of widgets"
                                    widget-parameter
                                    string->number]))]

The command-line options generated by default are:
@(command-line-options-table)



Various attributes are automatically defined within the spec, see @secref{generated-rules}.

Properties (defined with @racket[define-property]) are used to derive more @(racr) attributes as well as Xsmith choice-methods, and extra properties can be used in your fuzzer by adding them with @racket[#:properties].
Each property may have a transformer function that alters other properties, attributes, or choice-methods.
All properties referenced within a spec-component are used to generate attributes and choice-methods, as well as any properties specified in the @racket[maybe-properties] list.
Unless values for that property have also been specified within a spec component, properties in the @racket[#:properties] list will only be able to generate rules based on the default value for the property.
Note that any property that has a value attached to a node in the grammar will be run, but if you have defined custom properties that don't have values attached you may need to add the properties to this list.
In other words, if you have a property that defines an attribute on the default node without attaching a value to any node, you need to add it here or the attribute won't actually be added.

}


@defform[(add-to-grammar spec-component grammar-clause ...)
#:grammar [(grammar-clause (node-name parent-name (field ...) maybe-prop ..))
           (parent-name identifier #f)
           (field name/type-id
                  (name/type-id maybe-type-id maybe-kleene-star maybe-init-expr))
           (maybe-type-id (code:line)
                          (code:line : type-name))
           (maybe-kleene-star (code:line) *)
           (maybe-init-expr (code:line) (code:line = init-expr))
           (maybe-prop (code:line #:prop prop-id prop-val))]]{

Adds grammar productions to @racket[spec-component].

@racket[node-name] will be the name of the grammar production in @(racr).
@racket[parent-name] is either the name of the parent grammar production or @racket[#f].

Names for the node and fields are limited to alphabetic characters.  You may want to use camelCase style names since kebab-style or snake_style names are invalid due to this limitation.

Fields are then specified.
Each nonterminal inherits all fields of its parent nodes.
A field has a name, a type, an optional kleene star, and an optional initialization expression.
The type of each field is the name of the nonterminal that it must be or @racket[#f] for fields that may contain arbitrary Racket values.
A field name may be the same as the type, in which case the type does not need to be specified.
If a type is not specified and the name does not match the name of a nonterminal, then the type #f is used.
If the optional kleene star is supplied, the field will be a list field.
If a kleene star is provided for a non-false type, the name and type must be specified separately.

The @racket[init-expr] for each field specifies a default value for the field.
When a node is generated, each @racket[init-expr] is evaluated unless a non-default value is supplied to the generating function.
If no @racket[init-expr] is supplied, the following defaults are used:

@itemlist[
@item{For false types, @racket[#f].}
@item{For nonterminal types, a hole node of the appropriate type.}
@item{For fields with a kleene star, an empty list.}
]

For nodes with a kleene star, @racket[init-expr] may return a list or a number.
If a number is provided, the default value is a list of that many of the non-kleene-star default value.

Example:
@racketblock[
(add-to-grammar
 my-spec-component
 [Expression #f ()]
 [LiteralInt Expression (v = (random 1000))]
 [AdditionExpression Expression ([left : Expression] [right : Expression])]
 [SumExpression Expression ([addends : Expression * = (random 5)])])
]

The example defines a piece of a grammath that includes some kinds of expressions.
When a @verb{LiteralInt} expression is generated, it's @verb{v} field will be populated with a random number.
Since the @racket[random] expression is evaluated for each fresh @verb{LiteralInt}, they will (probably) receive different values for @verb{v}.
When an @verb{AditionExpression} node is generated, it will be populated with an @verb{Expression} hole node for each of its @verb{left} and @verb{right} fields.
When a fresh @verb{SumExpression} is generated, its @verb{addends} field will be populated with a list of zero to four @verb{Expression} hole nodes.
}

@defform[(add-attribute spec-component rule-name rule-clause ...)
#:grammar [(rule-clause (nonterminal-name rule-function))]]{
Adds a @(racr) attribute rule to the spec-component.
The format for each rule is similar to that required by @(racr)'s @verb{ag-rule} form.

Example:
@racketblock[
(add-attribute
 my-spec-component
 interp
 [LiteralInt (λ (n) (ast-child 'v n))]
 [AdditionExpression (λ (n) (+ (att-value 'interp (ast-child 'left n))
                               (att-value 'interp (ast-child 'right n))))]
 [SumExpression (λ (n) (apply + (map (λ (c) (att-value 'interp c))
                                     (ast-children (ast-child 'addends n)))))])
]
}

@defform[(add-choice-method spec-component rule-name rule-clause ...)
#:grammar [(rule-clause (nonterminal-name rule-function))]]{
Adds an Xsmith choice-method to the spec-component.

Xsmith creates a choice object class for each node type in the specification grammar, following the same class hierarchy that AST nodes themselves do.  Choice objects are created every time Xsmith fills in a hole node.  One choice object is created for every node type that is legal to use in filling the hole.  Choice objects are then filtered according to the @racket[choice-filters-to-apply] property, and then the @racket[choice-weight] property of the remaining choice objects is used to determine the probability of choosing each one.  When a choice object is selected, its @racket[fresh] property is used to generate a new AST node of its type.  If all choices are eliminated, an exception is raised with a message stating which filter step invalidated each potential choice.

Choice-methods are methods on the choice objects.  Some choice-methods are used by @racket[choice-filters-to-apply] to filter choices.  Other choice-methods may be used by those filters or in the body of the @racket[fresh] property as helper methods.  While most information about the AST and the current choice are probably computed using attributes, information about choosing a specific node type to fill in an abstract hole (such as an expression hole which may be filled with many different types of expressions) are computed using choice-methods.

Choice-methods are methods in Racket's class system and therefore have the @racket[this] macro available for use in their bodies to access other methods (eg. with the @racket[send] macro).
Choice-methods also have the @racket[current-hole] macro available within their body so that they can query attributes of the @(racr) AST being elaborated (eg. with @verb{att-value} to access attributes and @verb{ast-parent} to inspect other nodes in the AST).

Since choice-methods are methods in Racket's @racket[class] system, they must be defined with a literal @racket[lambda] (with no parameter for the implicit @racket[this] argument).  If a method needs to modify state (such as to cache the computation of available references of the appropriate type), I would normally recommend the “let-over-lambda” pattern, but that is not allowed in this case.  To make up for this, I recommend using @racket[make-weak-hasheq] to hold the state, using the @racket[this] object as a key.


This is a poor example, but it demonstrates how attributes and choice-methods can be used together to help make choices:
@racketblock[
(add-choice-method
 my-spec-component
 my-weight-helper
 [#f 7]
 [AdditionExpression
  (λ () (if (att-value 'my-helper-attribute (current-hole))
            20
            5))])
(add-property
 my-spec-component
 choice-weight
 [#f (send this my-weight-helper)])
]

}

@defform[(add-property spec-component prop-name prop-clause ...)
#:grammar [(prop-clause (nonterminal-name prop-value))]]{
Adds property values to the spec-component.

Since property transformers are macros that may accept arbitrary domain-specific syntax, the grammar of prop-value varies for each property.
}


@defform[(current-hole)]{
Within the body of a choice-method, @racket[(current-hole)] returns the hole node being considered for replacement.
This allows choice-methods to query attribute attributes of the grammar.

Elsewhere it raises a syntax error.
}

@defform[(make-hole hole-type-expression)]{
Within the context of a spec component (eg. in the body of @racket[add-attribute], @racket[add-property], @racket[add-to-grammar], etc), @racket[make-hole] is a function to generate a hole of a given type.

For example, to make a hole node that will eventually be replaced with some type of @verb{Expression} node:
@racketblock[(make-hole 'Expression)]

This function is essentially used by @racket[add-to-grammar] as the default value for grammar fields with nonterminal types that lack an init-expr.

Outside of a spec component context, it raises a syntax error.
}

@defform[(make-fresh-node node-type-expression optional-field-value-dict)]{
Within the context of a spec component (eg. in the body of @racket[add-attribute], @racket[add-property], @racket[add-to-grammar], etc), @racket[make-fresh-node] is a function to generate a fresh node of the given type.
Construction of the new node is guided by the @racket[fresh] property.

For example, to generate a fresh @verb{AdditionExpression} node, specifying values for some of its fields:
@racketblock[(make-fresh-node 'AdditionExpression
                               (hash 'left (make-fresh-node 'LiteralInt
                                                            (hash 'v 5))))]

Note that the fresh node is initially created unattached to the rest of the program tree.
This means that any nodes whose @racket[fresh] implementation needs to inspect the tree may fail.
In particular, reference nodes can only lift bindings when attached to the tree, and will probably fail if created with @racket[make-fresh-node].
}

@section{Custom Properties}

Properties are used to create domain-specific languages or terse declarative syntax for specifying attributes and choice-methods.
Custom properties are probably only useful for shared infrastructure for multiple languages (perhaps with the exception of @racket[define-non-inheriting-rule-property]).


@defform[(define-property name
                          maybe-dups
                          maybe-reads
                          maybe-rewrites
                          maybe-appends
                          maybe-transformer)
#:grammar [(maybe-dups (code:line)
                       (code:line #:allow-duplicates? literal-bool))
           (maybe-reads (code:line)
                        (code:line #:reads transformer-arg-spec ...))
           (maybe-rewrites (code:line)
                           (code:line #:rewrites transformer-arg-spec ...))
           (maybe-appends (code:line)
                          (code:line #:appends transformer-arg-spec ...))
           (maybe-transformer (code:line)
                              (code:line #:transformer transformer-function))
           (transformer-arg-spec (property prop-name)
                                 (attribute rule-name)
                                 (choice-method rule-name)
                                 (grammar))
           ]]{
Defines a property for use with @racket[add-property].

Properties can have a transformer function that produce attributes, choice-methods, or other property values.
Property transformers can read the values set for the property by @racket[add-property], and optionally the values of other properties as well.
Not all properties must have a transformer -- some properties may just be a single place to store information that is used by (multiple) other properties.

The transformer function accepts a dictionary mapping grammar node names (as symbols) to the syntax objects from the right-hand-side of @racket[add-property] uses for each property that it reads.
The transformer function must return a list of dictionaries mapping grammar node names (as symbols) to syntax objects for the properties, attributes, and choice-methods that it writes.

Property transformers are run during @racket[define-xsmith-interface-functions] in an order determined by the read/write dependencies among the properties.
Appending goes before rewriting, which goes before reading.

If a property appends (using the #:appends keyword) to a property or rule, its return dictionary will be appended to the existing dictionary for that property/rule.
This allows a property or rule to be specified in part by the property that appends and in part by another property or the user.
If an appended rule (or property that disallows multiple values) ends up with two values for any node, an error is raised.

If a property rewrites (using the #:rewrites keyword) to a property or rule, its return dictionary replaces any previous dictionary for that property/rule.
Rewriting a property also automatically implies reading that property.
A property or rule may only be rewritten by one property transformer.

Example showing the arguments and return type needed by transformers:
The transformer argument order is its own property, then #:reads dictionaries in the order declared, then #:rewrites dictionaries in the order declared.
The transformer return order is #:rewrites dictionaries in the order declared then #:writes dictionaries in the order declared.
@racketblock[
(define-property my-property
  #:reads (property a) (property b)
  #:rewrites (property c)
  #:appends (attribute d) (choice-method e)
  #:transformer
  (λ (this-prop-dict prop-a-dict prop-b-dict prop-c-dict)
    (code:comment "compute output dictionaries...")
    (list dict-for-c dict-for-d dict-for-e)
    ))
]

The syntax object value for a property can be anything, since property transformers define the grammar and semantics of properties.

The syntax object value for @verb{attributes} and @verb{choice-methods} should be a syntax object specifying a function (IE a @racket[lambda]).
@verb{attributes} may be any syntax that evaluates to a function (so you may return an identifier that references a function or an expression that computes a function such as let-over-lambda), but choice-method syntax is provided to Racket's @racket[class] macro, which requires literal @racket[lambda] forms.

@; TODO - internal properties read the grammar, but the API for that is horrible.
@;The syntax object value for grammar productions when @verb{(grammar)} is read is a syntax object of class @racket[grammar-clause].

Dictionaries may or may not contain an entry for each nonterminal in the grammar.
@;(Except the grammar dictionary which always contains all nonterminals.)
A dictionary may even be empty.

In addition to nonterminals, each dictionary may include a mapping for the value @racket[#f], which will define a default value used for the (super secret) parent node that @racket[define-xsmith-interface-functions] defines.
If nothing is specified for #f, attributes and choice-methods will have a default which errors, providing a helpful error message.

If the @racket[#:allow-duplicates?] argument is supplied and is @racket[#t], then @racket[add-property] may be used more than once for the property for the same node, and the syntax object in the dictionary for the property will be a syntax list of the syntax objects specified in the various @racket[add-property] calls.
But by default only one use of @racket[add-property] is allowed per property per node type, and the syntax object in the dict is the single syntax object from the one call.

Here is a simple example that basically desugars straight to an attribute with a default:

@racketblock[
(define-property strict-child-order?
  #:appends (attribute _xsmith_strict-child-order?)
  #:transformer
  (λ (this-prop-info)
    (define _xsmith_strict-child-order?-info
      (hash-set
       (for/hash ([(n v) (in-dict this-prop-info)])
         (values n (syntax-parse v [b:boolean #'(λ (n) b)])))
       #f #'(λ (n) #f)))
    (list _xsmith_strict-child-order?-info)))
]

The rationale for this example property is to:
@itemlist[
@item{Allow values to be specified by just @racket[#t] or @racket[#f], rather than @racket[(λ (n) #t)]}
@item{To implicitly provide a default value to the root node (@racket[#f]).}
]

For more realistic examples of properties, see the file @verb{private/core-properties.rkt} in the Xsmith implementation.
Generally they are big, hairy macros.

}

@defform[(define-simple-property property-name rule-type optionals)
#:grammar [(rule-type attribute choice-method)
(optionals (code:line)
           (code:line #:rule-name rule-name)
           (code:line #:default default)
           (code:line #:transformer transformer))]]{
This provides a simpler interface to quickly define properties that have no dependencies on other properties.

Defines a property that generates an attribute or choice-method according to @racket[rule-type].
The optional @racket[#:default] value will be associated with the implicit @racket[#f] node (and thus be inherited by other nodes unless overrided).
The default is given as literal syntax (IE no hash-quote or @racket[syntax] form needed).
The optional transformer should be a syntax parser, or in other words, it must be a function from a syntax object to a syntax object.
The default transformer is the identity function, meaning the rule is written verbatim.
Note that using the default transformer means there is little reason to use a property, since you can just use @racket[add-attribute] or @racket[add-choice-method] directly instead.

By default the name of the generated @verb{attribute} or @verb{choice-method} is the same as the name of the property, but this may be overrided by providing @racket[#:rule-name].
}

@defform[(define-non-inheriting-rule-property property-name
                                              rule-type
                                              maybe-rule-name
                                              default-value
                                              maybe-transformer)
#:grammar [(maybe-rule-name (code:line)
                            (code:line #:rule-name rule-name))
           (default-value (code:line #:default default-expr))
           (maybe-transformer (code:line)
                              (code:line #:transformer transformer-func))]]{
Like @racket[define-simple-property], but it defines a property that generates an attribute or a choice-method that does NOT inherit its implementation from its superclass.

@racket[rule-type] must be either @verb{attribute} or @verb{choice-method}.

@racket[rule-name] defaults to @racket[property-name], but you can make it give the rule a different name than the property.

@racket[default-expr] is the default value of the property.  Any nonterminal that does not have a different value specified gets this one.  Note that the default is required, not optional, despite being a keyword argument.

@racket[transformer-func] is an optional transformer to transform the value.
It is not called with a dictionary like the transformers of @racket[define-property], but rather it receives each value individually.
This allows a small amount of sugar.
Note that the value supplied as the @racket[default-expr] is also transformed by the @racket[transformer-func] when it is supplied.
When no @racket[transformer-func] is supplied, values are passed through literally.

Example:
@racketblock[
(define-non-inheriting-rule-property
  some-bool-flag
  attribute
  #:default #f
  #:transformer (syntax-parser [#f #'(λ () #f)]
                               [#t #'(λ () #t)]))
(add-to-grammar
 a-spec-component
 [A #f ()]
 [B A ()]
 [C B ()])
(add-property
 a-spec-component
 some-bool-flag
 [A #t]
 [C #t])
]

Normally @verb{B} would inherit a method from @verb{A} when none was specified for it.
But in this case it inherits the default (@racket[#f]).
When a user tries @verb{(att-value 'some-bool-flag <node-of-type-B>)} it will return @racket[#f], not @racket[#t].
}


@;@subsection{Helpers for creating properties}
@;
@;TODO - the APIs in this section are not very good.  Frankly I would like to rewrite them all.  Also, I don't think this section is important for a first version.  So I'm considering commenting the whole section out.
@;
@;@defform[#:kind "syntax class" #:id grammar-clause grammar-clause]{
@;This is a syntax class used for parsing grammar clauses.  If you parse one with @racket[syntax-parse], you will have access to the following fields:
@;
@;@itemlist[
@;@item{@verb{node-name} - the node name as an identifier}
@;@item{@verb{parent} - the parent node name as an identifier or @racket[#f] (as syntax, not as a bare @racket[#f]) if the node has no parent.}
@;@item{@verb{component ...} - a list of @racket[grammar-components]}
@;]
@;
@;TODO - (properties and the grammar) should this be in the public interface?  Yes if I allow properties to change the grammar, but otherwise I'm not sure.  For reading, it would probably be easier to get a dict full of a struct with: node name, chain of parent nodes, field list (of grammar-node-field-structs).  It will also be faster if I construct that once, because in various properties I am reconstructing some of that info multiple times.
@;
@;Example:
@;@racketblock[
@;(syntax-parse grammar-stx
@;  [x:grammar-clause
@;   (code:comment "This basically reconstructs it.")
@;   #'(x.node-name x.parent (x.component ...))])
@;]
@;}
@;@defform[#:kind "syntax class" #:id grammar-component grammar-component]{
@;This is a syntax class used for parsing the fields of @racket[grammar-clause]s.
@;If you parse one with @racket[syntax-parse], you will have access to the following fields:
@;@itemlist[
@;@item{@verb{name} - the field name as an identifier}
@;@item{@verb{type} - optional: the field type as an identifier if there is one, otherwise @racket[#f] (IE use @racket[attribute] to check if it is present).}
@;@item{@verb{kleene-star} - optional: literal @verb{*} character as an identifier if the field is a list type, otherwise @racket[#f] (IE use @racket[attribute] to check if it is present).}
@;@item{@verb{init-expr} - optional: syntax object for the initial expression if present, otherwise @racket[#f] (IE use @racket[attribute] to check if it is present).}
@;]
@;
@;TODO - the name should be changed to grammar-field or grammar-clause-field or something.
@;
@;TODO - (properties and the grammar) should this be in the public interface?  As with @racket[grammar-clause], I think it should only if properties can add to the grammar.  As with the TODO comment in @racket[grammar-clause].
@;}
@;
@;@defthing[grammar-clause->parent-chain any/c]{
@;TODO - (properties and the grammar) this should probably not be in the public interface.
@;
@;In general I should provide better ways to inspect the grammar in property transformers, but I haven't needed them yet...
@;}
@;
@;@defproc[(grammar-node-name->field-info-list [name symbol?] [grammar-clause-hash any/c])
@;         (listof/c grammar-node-field-struct?)]{
@;This function should be called with a node type name and the grammar object given as an argument to a property transformer that reads the grammar.
@;
@;It returns a list of structs containing information about the node's fields (including fields inherited from super node types).
@;
@;TODO - (properties and the grammar) this should probably not be in the public interface
@;}
@;
@;@defstruct[grammar-node-field-struct
@;           ([name symbol?]
@;            [type (or/c symbol? #f)]
@;            [kleene-star? boolean?]
@;            [init-expr syntax?])
@;           #:omit-constructor]{
@;The struct type in the list returned by @racket[grammar-node-name->field-info-list].
@;
@;TODO - (properties and the grammar) Properties asking to read the grammar should get a dict full of grammar clause structs that include a list of these.
@;}



@section{Refiners}

Xsmith supports iterative refinement, which is a technique that allows for the modification of an existing AST.
There are no default refiners, and their use is entirely optional.
You might use a refiner to handle simplifications due to a post-generation analysis, such as using native mathematical operations instead of a specialized library when it has been proven that such operations will not produce unsafe results.

@defform[(define-refiner spec-name
                         refiner-name
                         maybe-follows
                         maybe-refiner-predicate
                         maybe-global-predicate
                         refiner-clause ...)
#:grammar [(maybe-follows (code:line)
                          (code:line #:follows refiner-name ...))
           (maybe-refiner-predicate (code:line)
                                    (code:line #:refiner-predicate pred))
           (maybe-global-predicate (code:line)
                                   (code:line #:global-predicate pred))]]{

Each @racket[refiner-clause] is a list of functions that each take exactly one argument: a node.
The final function in the list (the "refiner function") must return an AST node which will replace the node that was passed in to the function.
Each preceding function is a predicate that determines whether the refiner function will be run.
The list need only provide a refiner function; the predicate functions are optional.

Refiners can have an order imposed on them if so desired.
You can specify the @racket[#:follows] parameter, which accepts either a refiner name or a list of refiner names.
All refiners will be sorted based on these @racket[#:follows] declarations, and executed in the order determined.

If you want to predicate the execution of a refiner itself, you can use the @racket[#:refiner-predicate] parameter.
This parameter accepts a single function that, if given, will be run prior to attempting to run the refiner at all.
If it returns anything other than @racket[#f], the refiner will be run.
The function should be a thunk (i.e., it should accept zero arguments).
This is useful if you wish to have a refiner toggled on or off by a command-line argument.

Additionally, you may wish to have some predicate shared among all clauses of a refiner.
The @racket[#:global-predicate] parameter allows for this.
Similar to @racket[refiner-predicate], @racket[#:global-predicate] accepts a single predicate function as argument.
This function will be prepended to the list of functions for each clause.

The @racket[refiner-clause]s are similar to those used by the @racket[add-property] (and similar) functions.
One key difference with refiners is that there is always a @racket[#f] clause given.
(The default implementation of this clause if omitted by the user is @racket[[(λ (n) #f)]], which means the refiner has no effect for any node which does not have a matching clause defined.)
The predicate defined by @racket[#:global-predicate] (if present) will not ever be applied to the @racket[#f] clause.
This means that if you want its functionality to be applied to nodes which do not have matching clauses, you will need to use it in a custom @racket[#f] clause.


@racketblock[
(define-spec-component arith)

(add-to-grammar
 arith
 (code:comment "...")
 [ArithOrVal #f ()]  (code:comment "This simplifies choices for generation.")
 [Val ArithOrVal ([v = (random -100 100)])]  (code:comment "Integer values on the specified")
                                             (code:comment "interval (-100, 100).")
 [ArithOp ArithOrVal ([lhs : ArithOrVal]    (code:comment "Arithmetic operations have a")
                      [rhs : ArithOrVal])]  (code:comment "left-hand side and a right-hand")
                                            (code:comment "side, which can be either a value")
                                            (code:comment "or another arithmetic operation.")
 [AddOp ArithOp ()]  (code:comment "+")
 [SubOp ArithOp ()]  (code:comment "-")
 [MulOp ArithOp ()]  (code:comment "*")
 [DivOp ArithOp ()]  (code:comment "/")
 [ModOp ArithOp ()]  (code:comment "%")
 (code:comment "...")
 )

(code:comment "... (more code here)")

(define-refiner
  arith
  make-even
  (code:comment "This refiner makes all literal integer values into even values by")
  (code:comment "incrementing them by 1. The `#:refiner-predicate` parameter says that it")
  (code:comment "will only be run when the `make-even` feature is enabled. This does")
  (code:comment "require the make-even feature to be defined in the `define-xsmith-interface-functions`")
  (code:comment "macro.")
  #:refiner-predicate (λ () (xsmith-feature-enabled? 'make-even))
  [Val [(λ (n) (odd? (ast-child 'v n)))
        (λ (n) (make-replacement-node
                 'Val
                 n
                 (hash 'v (+ 1 (ast-child 'v n)))))]])

(define-refiner
  arith
  replace-rhs-val-with-zero
  (code:comment "The `replace-rhs-val-with-zero` refiner looks at the right-hand side of")
  (code:comment "every ArithOp to see if it's a Val. If it is, its internal value will")
  (code:comment "be replaced with 0.")
  #:global-predicate (λ (n) (node-subtype? (ast-child 'rhs n) 'Val))
  [ArithOp [(make-replacement-node
              (node-type n)
              n
              (hash 'rhs (make-fresh-node
                           'Val
                           (hash 'v 0))))]])

(define-refiner
  arith
  prevent-divide-by-zero
  (code:comment "To prevent divide-by-zero errors, this refiner looks for generated")
  (code:comment "DivOps and checks whether their right-hand side is a 0. If it is, it")
  (code:comment "is replaced with a 13.")
  #:follows:replace-rhs-val-with-zero
  #:refiner-predicate (λ () (xsmith-feature-enabled? 'prevent-divide-by-zero))
  [DivOp [(λ (n) (node-subtype? (ast-child 'rhs n) 'Val))
          (λ (n) (eq? 0 (ast-child 'v (ast-child 'rhs n))))
          (λ (n) (make-replacement-node
                   'DivOp
                   n
                   (hash 'rhs (make-fresh-node
                                'Val
                                (hash 'v 13)))))]])

(code:comment "... (more code here)")

(define-xsmith-interface-functions
 [arith]
  (code:comment "... (other options specified here as needed)")
  #:features ([make-even #f]
              [prevent-divide-by-zero #t]))
]
}



@defform[(make-replacement-node new-node-type
                                original-node
                                [children])
 #:grammar [(children (hash child-name child-value ...))]]{

During refinement, you may wish to replace an entire node with a new node of a different type.
@racket[make-replacement-node] handles all the logic involved in ``replacing'' a node in RACR for you.

A @racket[new-node-type] must be given, which is a symbol corresponding to a node type in the grammar.
Then, @racket[original-node] is the node which is being replaced.
The new node will completely take the place of the old node; there will no longer be any references to @racket[original-node] in the AST.
Additionally, all children of the @racket[original-node] will be copied to the new node except those replaced by the @racket[children] hash.
This is simply a hash of child names to values, just like in @racket[make-fresh-node].

@racketblock[
(define-refiner
 my-spec-component
 replace-plus-with-minus
 [PlusOp [(λ (n) (make-replacement-node 'MinusOp n))]])
]

Note that @racket[make-replacement-node] should only be used in the bodies of refiner functions!
If a replacement is made but the refiner fails for some other reason (i.e., it returns @racket[#f]), the replacement will be undone.
}



@section{Scope Graph Functions}

This section describes the API to the @seclink["scope-graph"]{Scope Graph} model of binding.

@defstruct[binding ([name string?]
                    [ast-node ast-node?]
                    [type type?]
                    [def-or-param (or/c 'definition 'parameter)])
                   #:omit-constructor]{
Struct for binding information of nodes that create bindings.

Notably this is returned by the attribute @rule[xsmith_binding].

The @racket[ast-node] field is the grammar node containing the definition of @racket[name].
The @racket[def-or-param] field is there to distinguish names that are bound as function parameters vs names that are bound as (local or global) definitions.

Probably all you need to do with this, though, is get the @racket[name] field and stuff it in the name field of your definition nodes in the @racket[fresh] method.
}

@defparam[current-well-formedness-regexp r regexp?
          #:value #px"rp*i?d"]{

For most languages you probably don't need to fuss with this.

When the @rule[xsmith_binding] attribute is used or when Xsmith searches for a valid reference with @rule[xsmith_get-reference], this regexp determines valid scope-graph resolution paths.
The path elements (reference, parent, import, definition) are turned into characters (r, p, i, and d respectively).
If the path from reference to definition matches this regexp, it is valid.
If two definitions have the same name and paths from a reference to both definitions are valid, the definition that is in scope for the reference is determined by @racket[current-path-greater-than].

Because Xsmith doesn't currently support import elements at all, modifying this regexp is somewhat of a moot point.

}

@defparam[current-path-greater-than comparator
         (-> (listof/c (or/c 'reference 'parent 'import 'declaration))
             (listof/c (or/c 'reference 'parent 'import 'declaration))
             any/c)]{

If there are two valid resolution paths (determined by @racket[current-well-formedness-regexp]) for a name, this comparator determines which path is chosen.
The comparator must return a non-false value if the left operand is greater than the right, otherwise @racket[#f]
The greatest path is chosen.

By default the comparator walks down the paths, comparing each element.
A path is greater than another if its first differing element is greater.
From least to greatest, path elements are @racket['reference], @racket['parent], @racket['import], @racket['declaration].

For most languages you probably don't need to fuss with this.
}

@;@subsection{scope-graph.rkt -- other things provided but that maybe aren't part of the public interface}
@;@itemlist[
@;@item{resolve-reference -- used by the resolve-reference attribute}
@;@item{visible-bindings -- used by the xsmith_visible-bindings attribute}
@;@item{scope -- struct used by basically all of the functions, but should be invisible to end-user}
@;@item{reference -- struct used internally like scope}
@;@item{module -- struct that I made for paths that include imports, but I haven't actually used that yet, so it is just languishing...}
@;]



@section{Core Properties}

These properties are available for specification on Xsmith grammars.
Many of them define or modify the behavior of core functionality, such as @racket[fresh] modifying how fresh program nodes are instantiated, or @racket[type-info] defining a language's type rules.

Many are optional, but for any language at all you probably need to use @racket[type-info] and @racket[may-be-generated].
Additionally, for any language with variables you need @racket[binder-info] and @racket[reference-info].
Next, the @racket[fresh] property will likely become necessary for a few fields that Xsmith can't infer.
Then the most important of the truly optional properties is likely @racket[choice-weight].




@defform[#:kind "spec-property" #:id may-be-generated may-be-generated]{
Acceptable values for this property are @racket[#t] or @racket[#f], and the default is @racket[#t].

If may-be-generated is false, the node is not added to the list of possibile choices to replace an appropriate AST hole.
It is useful to set it to false for abstract node types or for specialized node types that are meant to be swapped in only after a full tree is generated, such as by a later analysis to determine validity of an unsafe operation.
This property is NOT inherited by subclasses.

Example:
@racketblock[
(add-property
 my-spec-component
 may-be-generated
 (code:comment "Expression is abstract and should not be instantiated,")
 (code:comment "only AdditionExpression, SubtractionExpression, etc.")
 [Expression #f]
 (code:comment "Only safe addition expressions should be generated,")
 (code:comment "but maybe a later pass after generation swaps some")
 (code:comment "safe addition expressions for unsafe ones after analysis.")
 [UnsafeAdditionExpression #f])
]

@; TODO - should I rename this to `abstract` (and flip its boolean interpretation)?  For most uses that probably makes sense.  Or better yet, perhaps I should add a separate `abstract` property that implies that it may not be generated, but also makes sure the node never ends up in the tree by any means.  As it is Cish uses nodes that may not be generated but may be added in a second pass (the unsafe math nodes).
}



@defform[#:kind "spec-property" #:id type-info type-info]{

This property is used to specify the type system used by the generator.
You should specify a type system even for dynamically typed languages so that programs don't just crash with dynamic type errors.

Example:
@racketblock[
(define number (base-type 'number))
(define int (base-type 'int number))
(define float (base-type 'float number))
(define bool (base-type 'bool))
(add-property
 my-spec-component
 type-info
 [AdditionExpression [(fresh-subtype-of number)
                      (λ (n t) (hash 'l t 'r t))]]
 [SubtractionExpression [(λ (n) (fresh-subtype-of number))
                         (λ (n t) (hash 'l t 'r t))]]
 [EqualityExpression [bool
                      (λ (n t)
                        (define arg-type (fresh-type-variable))
                        (hash 'l arg-type
                              (code:comment "You could compute more by using")
                              (code:comment "a function and computing the type")
                              (code:comment "given the actual child node.")
                              'r (λ (r-node) arg-type)))]]
 [Lambda [(function-type (fresh-type-variable) (fresh-type-variable))
          (λ (n t) (hash 'arg (function-type-arg-type t)
                         'Expression (function-type-return-type t)))]])
]

The property is two armed.
The first arm specifies the type(s) that a node can inhabit, and its value is @racket[unify!]-ed with the node's type during type checking.
The second arm specifies a relationship between a parent node and its children.
Children nodes are allowed to be subtypes of the type specified by their parent, so a node's type is @racket[subtype-unify!]-ed with the type specified by its parent during type checking.

The first part is the type (or partially-constrained type variable) that the given node can inhabit.
The expression given is evaluated fresh every time a node is type checked or considered for generation.
When determining the type of a node, this value is @racket[unify!]-ed with this value.

Additionally, the expression for a node's type constraint may be a function that takes the node and returns a type instead of a type directly.
However, the function may be passed a hole node that does not yet have properly populated fields and that may be a supertype of the node type you are defining the constraint for.
So if you use a function here, you need to check the node with @racket[(att-value 'xsmith_is-hole? node)]

The second part is a function that takes a node, its type, and must return a dictionary mapping its children nodes to types.
The dictionary keys may be the node objects of the node's children OR the symbol of the field name the child inhabits.
The values in the hash may be types or functions from node to type.
For kleene-star children, if they are all supposed to have the same type, it is convenient to use the field name as the key.
However, if a kleene-star child needs a different type for each element of the list, you should use the nodes themselves as keys or use a function as dictionary value.

Note that the type argument of the function may always be a type variable.
Even if the only possible type that could come in is, say, the @tt{int} type, it may come in wrapped as a type variable that has been unified such that @tt{int} is the only possibility, but it is not @racket[equal?] to the @tt{int} type.
To deconstruct types, you should create the type you want to deconstruct, @racket[unify!] it with the type you want to deconstruct, then deconstruct the one you built.
For example:
@racketblock[
(add-to-grammar
 my-spec-component
 [Cons Expression ([newval : Expression] [list : Expression])])
(add-property
 my-spec-component
 type-info
 [Cons [(list-type (fresh-type-variable))
        (λ (n t)
          (define lt (fresh-list-type))
          (unify! lt t)
          (define inner (list-type-type lt))
          (hash 'newval inner 'list t))]])
]

If the typing function makes any decisions about the type of the parent node when deciding the type of the child nodes, the decision needs to be reflected by either some unification with the child node types, construction/deconstruction using the parent type, or by saving the decision in the parent node and unifying with it in future passes.
Here's an example of a subtly broken type rule:

@racketblock[
(add-to-grammar
 my-spec-component
 [Convert Expression (Expression)])
(add-property
 my-spec-component
 type-info
 [Convert [(fresh-type-variable int float)
           (λ (n t)
             (cond
               (code:comment "Don't do this.")
               [(can-unify? t int) (hash 'Expression float)]
               [(can-unify? t float) (hash 'Expression int)]))]])
]

This example uses the input type to determine an output type, and in fact makes a decision about what the input type is.
But this decision isn't saved.
The problem is that @tt{t} could be a concrete @tt{int}, a concrete @tt{float}, or it could be a type variable that could be unified with either!
In the first two cases this type rule is fine, but in the third case it will effectively decide that the type is an @tt{int}, and generate a @tt{float} child expression.
Because it doesn't unify and save this decision, generation in another branch of code could then unify the variable with @tt{float}, which would then cause a type error when this type function is re-run.

To fix it, there are two choices.
@itemlist[
@item{
You could split @tt{Convert} into two rules that accept one type each.
@racketblock[
(add-to-grammar
 my-spec-component
 [IntToFloat Expression (Expression)]
 [FloatToInt Expression (Expression)])
(add-property
 my-spec-component
 type-info
 [IntToFloat [int (λ (n t) float)]]
 [FloatToInt [float (λ (n t) int)]])
]

This solves the problem by making the type choice explicit in the requirements to generate the conversion node in the first place.
}
@item{
You could save the choice in a field of the @tt{Convert} node and re-unify in later passes.

@racketblock[
(add-to-grammar
 my-spec-component
 [Convert Expression (Expression outputtype)])
(add-property
 my-spec-component
 type-info
 [Convert [(fresh-type-variable int float)
           (λ (n t)
             (define it (ast-child 'outputtype n))
             (when it
               (unify! t it))
             (define (save-choice chosen-type)
               (when (not it)
                 (enqueue-inter-choice-transform
                  (λ ()
                    (rewrite-terminal 'outputtype n chosen-type)))))
             (cond
               [(can-unify? t int)
                (save-choice int)
                (hash 'Expression float)]
               [(can-unify? t float)
                (save-choice float)
                (hash 'Expression int)]))]])
]
}
]

Note that the complexity of the @tt{Convert} example arises because the input type has a relationship with the output type that can't be expressed in terms of a construction, decomposition, or unification of types.
}




@defform[#:kind "spec-property" #:id fresh fresh]{
This property determines how fresh nodes are constructed (by the @racket[make-fresh-node] function).

Acceptable values for this property are expressions which produce a @racket[dict?] object, or expressions which produce a function of type (-> dict? dict?).  Keys of the dictionary must be field names of the node being generated.  The values in the dictionary are used to fill node fields of the appropriate name.  Any field whose name is not in the dictionary will be filled by evaluating the default init-expr defined in the grammar (via @racket[add-to-grammar]).

Example:
@racketblock[
(add-to-grammar
 my-spec-component
 [Expression #f ()]
 [LiteralInt Expression (v = (random 1000))]
 [AdditionExpression Expression ([left : Expression] [right : Expression])])
(add-property
 my-spec-component
 fresh
 (code:comment "Make AdditionExpressions always be generated with a literal 7 argument.")
 [AdditionExpression (hash 'left (make-fresh-node LiteralInt (hash 'v 7)))])
]

This is useful for fields that must be determined together.  For example, a function call needs the function name and the number of arguments to be chosen together rather than independently.

As with all choice-methods, @racket[this] and @racket[current-hole] are available for use in expressions, which you may want to do for eg. accessing available bindings or mutable information connected to the choice object.

If the result is a procedure instead of a dictionary, that procedure must accept and return a dictionary.  It is called with a dictionary that is empty unless the node being created is the result of lifting a definition.  In that case it will have the appropriate name and type fields with the name and type chosen by the lifting mechanism.  In the case of lifting a definition, the name and type fields in the return dictionary are ignored.  This procedure option is allowed because your fresh expression may need access to the name or type to determine the values of other fields.  If a definition node only has a name and type field then a fresh property is unnecessary when lifting, and if lifting is the only way you generate definitions then fresh properties or initializers for definition nodes are unnecessary.

If the value for a field (IE values inside the result dictionary) is a procedure, it will be called with 0 arguments.  This allows the fresh property to provide a default value that is not evaluated when @racket[make-fresh-node] is called with an appropriate value.

}




@defform[#:kind "spec-property" #:id choice-weight choice-weight]{
This property determines the probability that different kinds of nodes will be chosen.  When choices have been filtered (based on @racket[choice-filters-to-apply]), one of the remaining choices is chosen at random with probability (choice-weight / sum-of-choice-weights).

The expression provided as the choice weight will be evaluated in the context of a method call, so @racket[this] and @racket[current-hole] are available.

Choice weights should be positive integer values.  The default weight is 10 unless set explicitly.

Example:
@racketblock[
(add-property
 my-spec-component
 choice-weight
 (code:line "The default choice weight.")
 [#f (λ () 10)]
 (code:line "Generate more AdditionExpressions")
 [AdditionExpression 20]
 [MultiplicationExpression 15]
 (code:line "Generate fewer SumExpressions")
 [SumExpression 5])
]
}



@defform[#:kind "spec-property" #:id depth-increase depth-increase]{
This property defines the @rule[xsmith_ast-depth] non-inheriting attribute.

The property accepts an expression which much evaluate to a function of one argument (the @(racr) AST node) which returns an integer for the depth increase.
The default is @racket[(λ (n) 1)].
This property is NOT inherited by subclasses.

This is useful to allow node re-use.  For example, the body of an @verb{if} or @verb{for} statement might be a block and have the same semantics, but you might want a block inside an @verb{if} to only be considered a depth increase of 1, not 2.

Example:
@racketblock[
(define no-depth-if-body-is-block
  (λ (n) (if (node-subtype? (ast-child 'body n) 'Block) 0 1)))
(add-property
 my-spec-component
 depth-increase
 [IfStatement no-depth-if-body-is-block]
 [ForStatement no-depth-if-body-is-block])
]
}

@defform[#:kind "spec-property" #:id wont-over-deepen wont-over-deepen]{
The default for this property is probably what you want, so probably just be sure to add this to the extra #:properties flag of @racket[assemble-part-specs].

But if you want to set it:

The property accepts expressions which will evaluate to booleans (IE anything but only #f is false...), which are evaluated if the choice is made at the point where the AST is at it maximum depth.  A true value means that the choice is acceptable, false otherwise.  The default is computed by checking whether a node includes AST-node-typed fields.  If it does not it is considered atomic and therefore acceptable to choose when the AST is already at its maximum depth.
}


@defform[#:kind "spec-property" #:id binder-info binder-info]{
This property is used to mark nodes that define bindings.
The property consists of a list of optional keyword arguments.
@itemlist[
@item{
@racket[#:name-field] specifies the name of a field in the node that contains the name of the definition, and defaults to @verb{name}.
}
@item{
@racket[#:type-field] specifies the name of a field in the node that contains a type annotation for the definition, and defaults to @verb{type}.
}
@item{
@racket[#:binder-style] must be either @verb{definition} or @verb{parameter}, reflecting whether the binding is a function parameter (default: @verb{definition}).
This is used by some Xsmith analyses about higher order values.
}
@item{
@racket[#:lift-target?] defaults to @racket[#t].  When @racket[#:lift-target?] is true @emph{and} the binding style is @verb{definition} and not @verb{parameter} then the binder is eligible to be lifted automatically.
}
]

Note that for a basic definition with default name fields, the property need only contain an empty list to mark that the node is in fact a binder.

Example:
@racketblock[
(add-to-grammar
 my-spec-component
 [Definition #f (name type Expression)
   #:prop binder-info ()]
 [FormalParameter #f (name type)
   #:prop binder-info (#:binder-style parameter)]
 [Reference #f (name)
            #:prop reference-info (read)])
]

Note that when definitions are lifted automatically, the @verb{name} and @verb{type} fields are given automatically.
But if definition nodes are filled in normally (not via lifting), values must be provided by the @racket[fresh] property.
}
@defform[#:kind "spec-property" #:id reference-info reference-info]{
This property marks nodes that are reference nodes.
The argument for the property is a list containing:

@itemlist[
@item{The identifier @verb{read} or the identifier @verb{write}, indicating whether the reference reads or writes the variable}
@item{(Optional) @racket[#:name-field]:  The name of the field that stores the reference name (as an identifier).  Defaults to @verb{name}.}
@item{(Optionaly) @racket[#:unifies]:  Accepts the name of a field that the type checker unifies with respect to.  The default value, @racket[#t], unifies the node itself instead of one of its fields.  Use @racket[#f] to disable automated unification for this node.  (If you disable unification, you should implement your own manually!)

If you give a field for the unification target, that field's type rule must NOT depend on the parent node's type.
Essentially, the @racket[#:unifies] argument is meant for writes to any type in a node that itself has type void.}
]

Example:
@racketblock[
(add-property
 my-spec-component
 reference-info
 [Reference (read)]
 [Assignment (write name #:unifies Expression)])
]
}

@defform[#:kind "spec-property" #:id reference-choice-info reference-choice-info]{
This property allows you to bias reference choice.

The property takes a function that takes three arguments:
@itemlist[
@item{The node}
@item{A (potentially empty) list of @racket[binding?] reference options}
@item{A boolean that tells whether lifting a new definition is an option}
]

The function must return one of the options, the symbol @racket['lift] to lift a fresh definition, or @racket[#f].
If it returns @racket[#f], then you can't have a reference there, and you have to deal with that in your fuzzer.
Returning @racket[#f] is probably a bad idea.
The default shouldn't ever return @racket[#f].
The default value biases towards choosing parameters over definitions and lexically closer bindings over far ones.

You can give different values to different nodes, but a default on @racket[#f] is probably good enough?

Here is an example that randomly chooses any option available, that only lifts when there are no existing options:
@racketblock[
(add-property
 my-spec-component
 reference-choice-info
 [#f (λ (n options lift-available?)
       (if (null? options)
           (and lift-available? 'lift)
           (random-ref options)))])]
}

@defform[#:kind "spec-property" #:id binding-structure binding-structure]{
This property is used on nodes that can have binders as children.
It determines the visibility of those binders to their siblings.
Options are @racket['serial] (like @verb{let*} in scheme), @racket['parallel] (like @verb{let} in scheme), and @racket['recursive] (like @verb{letrec} in scheme).

If the property is not specified, @racket['serial] is assumed and used as a default.

Example:
@racketblock[
(add-to-grammar
 my-spec-component
 [Let #f ([definitions : Definition *] Expression)]
 [Letstar #f ([definitions : Definition *] Expression)]
 [Letrec #f ([definitions : Definition *] Expression)]
 )
(add-property
 my-spec-component
 binding-structure
 [Let 'parallel]
 (code:comment "Letstar we can leave blank if we want because serial is the default.")
 [Letrec 'recursive])
]
}


@defform[#:kind "spec-property" #:id strict-child-order? strict-child-order?]{
Specifies that a node's children are guaranteed by the language to have a strict evaluation order.
The default is false.
This property is used to determine whether nodes have a dependency in their read/write/io effect conditions.
(Those conditions are set by the @racket[io] and @racket[reference-info] properties.)

Setting this property is unnecessary, but using it allows more liberal use of references, mutation, and io effects.

Example:
@racketblock[
(add-property
 my-spec-component
 strict-child-order?
 (code:comment "Most languages have some sort of sequential construct, like a block")
 [Block #t])
]
}

@defform[#:kind "spec-property" #:id mutable-container-access
         mutable-container-access]{
Used to specify whether a node reads or writes a mutable container.
Here “container” means any kind of record, array, list, or compound data type.
If you have any kind of mutable record, array, etc, tag the read and write nodes with this effect.

The property takes a list of either the identifier @tt{read} or the identifier @tt{write}, then an expression for the key for the kind of mutable container.
The key can be anything, but it needs to be @racket[eq?] for each access of the same container type and not @racket[eq?] for accesses to different container types.
Eg. you could use the type constructor, or just a symbol for the name of the type.

Example:
@racketblock[
(add-property
 my-spec-component
 mutable-container-access
 [MutableRecordGetField (read 'MutableRecord)]
 [MutableRecordSetField (write 'MutableRecord)]
 [MutableArrayReference (read 'MutableArray)]
 [MutableArraySet (write 'MutableArray)])]
}

@defform[#:kind "spec-property" #:id io io]{
Used to specify that a node has some kind of IO effect, such as printing or reading a volatile variable.

Example:
@racketblock[
(add-property
 my-spec-component
 io
 [Print #t])
]
}


@defform[#:kind "spec-property" #:id lift-predicate lift-predicate]{
This property specifies a predicate for whether a definition of a given type can be lifted to a node.

Example:
@racketblock[
(add-to-grammar
 my-spec-component
 [Let #f ([definitions : Definition *] Expression)]
 [Letstar #f ([definitions : Definition *] Expression)]
 [Letrec #f ([definitions : Definition *] Expression)]
 )
(add-property
 my-spec-component
 lift-predicate
 (code:comment
  "Allow any definition type to be lifted into the top level of a program.")
 [Program (λ (n type) #t)]
 (code:comment
  "Lifting a definition to Lambda's formal parameter list would require changing all calls.")
 [Lambda (λ (n type) #f)]
 (code:comment
  "Allow local variables to be lifted, except make all functions top-level.")
 [Let (λ (n type) (not (function-type? type)))])
]
}

@defform[#:kind "spec-property" #:id lift-type->ast-binder-type
lift-type->ast-binder-type]{
If you have more than one binding node in your language (IE via @racket[binder-info]) you must specify this property.
This property should be defined once for the base node (#f).
It is a mapping from the type of a desired definition (eg. int, float, int -> int, ...) to the AST node type (eg. VariableDefinition, FunctionDefinition).
This is important when different kinds of definitions use different AST nodes.
Otherwise it is just boilerplate...
@; introduces these private rules:

Example:
@racketblock[
(add-to-grammar
 my-spec-component
 [VariableDefinition #f (name type Expression)]
 [FunctionDefinition #f (name type Body)]
 )
(add-property
 my-spec-component
 lift-type->ast-binder-type
 [#f (λ (type) (if (function-type? type)
                   'FunctionDefinition
                   'VariableDefinition))])
]
}


@defform[#:kind "spec-property" #:id choice-filters-to-apply choice-filters-to-apply]{

This property accepts a syntax list of choice-method names to use as a filter for the node type.  Generally this should be set on the greatest super node type (or @racket[#f] if there is no explicit super node type in your grammar).  Each choice-method in the list is called on the choice object with no arguments.  Each rule that returns @racket[#f] rules the node out as a choice for filling in a hole.

Uses for this include restricting where certain nodes can appear.
For example, the easiest way to create a fuzzer for a language with functions is to have a Lambda expression node.
However, some languages do not support first-class functions.
But you can still encode the fuzzer as having a lambda node that is simply restricted to only be generated as children of definition nodes, or perhaps only global definition nodes.

In this code snippet, we define a choice method encoding this restriction, and add it to the @racket[choice-filters-to-apply]:
@racketblock[
(add-choice-method
 my-component
 no-lambda-except-global-def
 [#f (λ () #t)]
 [Lambda (λ ()
           (and (parent-node current-hole)
                (equal? (ast-node-type (parent-node current-hole))
                        'Definition)
                (parent-node (parent-node current-hole))
                (equal? (ast-node-type (parent-node (parent-node current-hole)))
                        'Program)))])
(add-property my-component
          choice-filters-to-apply
          [#f (no-lambda-except-global-def)])
]

Some core methods are always applied in addition to this list, such as the method defined by the @racket[may-be-generated] property.
If you don't make custom filtering rules you don't need to specify this property.


}


@defform[#:kind "spec-property" #:id edit edit]{

The @racket[edit] property allows program trees to be edited during program elaboration.
The main purpose of the @racket[edit] property is to control ordering when filling out a tree.
Xsmith doesn't guarantee an ordering when filling in holes, but provides the @racket[edit] property to take manual control.


The edit property is specified as a procedure that takes the node in question and returns either @racket[#f] to signify that there is no edit necessary or a thunk that performs the edit.
The returned thunk @emph{must} perform an edit in such a way that the @racket[edit] property's procedure will return @racket[#f] in the future, or editing will loop indefinitely.
The edit property may be specified multiple times, each with a different procedure that potentially performs an edit.
If multiple edit procedures are specified, the ordering of the procedures is not guaranteed, so if an order between them is necessary, they must have their own method of signalling between them.

One way the edit property may be used to stage edits is to specify a @racket[fresh] property that fills nodes as bud nodes (with @tt{create-ast-bud}), then check for that node's dependencies (eg. sibling nodes) in the edit property.
When the dependencies are appropriately filled, the edit property can then replace the bud node with a hole node (using @tt{rewrite-subtree} and @racket[make-hole]) to be filled in normally.
Note that if you follow this pattern, you need to take care in other properties (such as @racket[type-info]) to check whether children are bud nodes before trying to query their attributes.

@racketblock[
(add-to-grammar
 my-component
 (code:comment "be sure the left node is filled before the right node")
 [Addition Expression ([l : Expression] [r : Expression = (create-ast-bud)])
           #:prop edit (λ (n) (and (ast-bud-node? (ast-child 'r n))
                                   (not (att-value 'xsmith_is-hole?
                                                   (ast-child 'l n)))
                                   (λ () (rewrite-subtree (ast-child 'r n)
                                                          (make-hole 'Expression)))))])
]

}


@defform[#:kind "spec-property" #:id render-node-info render-node-info]{

Xsmith provides built-in pretty printer functionality used for final program output and debugging support.  This is given as a function which takes in one argument (a node) and renders that node in whatever format you like.  Common formats include plain strings, PPrint documents, or s-expressions.  If your @racket[render-node-info] functions don't return strings, then you must implement the @racket[#:format-render] argument of the @racket[define-xsmith-interface-functions] function to convert the final rendered AST to a string for pretty-printing as output.
During debugging, this property may be called on a hole instead of a filled-in node.  If this happens, Xsmith will delegate to the @racket[render-hole-info] property, detailed below.
The rendering function is defined as the @rule[xsmith_render-node] attribute.

Example:
@racketblock[
(add-property
 my-spec-component
 render-node-info
 [#f (λ (node) (symbol->string (ast-node-type node)))])
]
}


@defform[#:kind "spec-property" #:id render-hole-info render-hole-info]{

For help with debugging, this property allows you to render holes.
By default, this function simply returns a string containing the type of the hole wrapped in angle brackets.  If you have specified a custom @racket[render-node-info] property and that property returns some type other than a string, it is likely you will want to configure this property to return the same type.

Example:
@racketblock[
(add-property
 render-hole-info
 [#f (λ (hole) (format "<~a>" (symbol->string (ast-node-type hole))))])
]
}

@section{Types}

These type constructors and other functions are largely useful for specifying the @racket[type-info] property.

While there are various predicates for different types, at any point in type checking you might actually have a type variable instead of a concrete type.
So if you want to check if you have a particular type (and maybe deconstruct it), you should maybe create an instance of the type you are interested in, check if it @racket[can-unify?], then @racket[unify!]-ing it if you want to deconstruct it.
Note that if you do @racket[unify!] a type variable, that unification needs to be consistent between multiple runs of type checking (since it runs multiple times as the tree is constructed).
In other words, if you randomly choose a type at any point, you need to store that type in a grammar attribute and consistently unify against it.


@defproc[(type? [t any/c]) bool?]{
Predicate for types.
}

@defproc[(fresh-type-variable [args type?] ...) type?]{
Creates a fresh type variable.  If given no arguments it is unconstrained and can unify with any type.  Arguments can be provided, in which case the type variable is constrained to be one of the types given.
In the optional arguments, only one function type is allowed.

Example:
@racketblock[
(code:comment "Unconstrained")
(define v1 (fresh-type-variable))

(define int (base-type 'int))
(define float (base-type 'float))
(define bool (base-type 'bool))

(define v2 (fresh-type-variable int bool))

(unify! v1 v2)

(can-unify? v1 float) (code:comment "#f")
(can-unify? v1 int) (code:comment "#t")

(unify! v2 bool)
(can-unify? v1 int) (code:comment "#f")
]
}

@defproc[(fresh-subtype-of [t type?]) type?]{
Creates a fresh type variable that is constrained to be a subtype of @racket[t].
}

@defproc[(type-variable? [t any/c]) bool?]{
Predicate for type variables.
}

@defproc[(can-unify? [t1 type?] [t2 type?]) bool?]{
Returns whether two types can be unified without actually unifying them.

Note that if @racket[(can-unify? t1 t2)] is true, then both @racket[(can-subtype-unify? t1 t2)] and @racket[(can-subtype-unify? t2 t1)] are true, but it is NOT the case that @racket[(can-subtype-unify? t1 t2)] and @racket[(can-subtype-unify? t2 t1)] imply that @racket[(can-unify? t1 t2)] is true!
}
@defproc[(unify! [t1 type?] [t2 type?]) void?]{
Unifies two types.  This mutates type variables so that they match other variables or types going forward.

If unification fails an exception is raised.  Right now a failure to unify might mean that type variables are left in a bad state, so code generation should just give up at that point.

}

@defproc[(can-subtype-unify? [sub type?] [super type?]) bool?]{
Returns whether two types can be subtype-unified without actually unifying them.
}
@defproc[(subtype-unify! [sub type?] [super type?]) void?]{
Puts two types into a subtype relationship.
This mutates type variables so that they are constrained in a lattice relationship with other type variables.

Note that
@racketblock[
(begin
  (subtype-unify! sub super)
  (subtype-unify! super sub))
]
is equivalent to:
@racketblock[
(unify! sub super)
]

If unification fails an exception is raised.
A failure in unification is basically catastrophic, so no code generation should be attempted after a unification failure.

The @racket[subtype-unify!] function is used automatically during type checking to put a node's type in the subtype relationship with the type its parent provides it, so you probably don't need to use this function manually.
The @racket[unify!] function is more useful in user code.
}

@defproc[(base-type [name symbol?] [supertype (or/c #f base-type?) #f] [#:leaf? leaf? any/c #t]) type?]{
Creates a base type.  Base types are the same if they are @racket[eq?].  The @racket[name] field is really just for convenience in printing for debugging.

If @racket[leaf?] is true, no subtypes of this base type can be created.
}

@defproc[(base-type? [x any/c]) any/c]{
Predicate for base-types.
}

@defproc[(base-type-name [bt base-type?]) symbol?]{
Get the name of a base type.
}


@defproc[(function-type [arg-type type?] [return-type type?]) type?]{
Creates a function type.
For multi-argument functions, use a @racket[product-type] for the argument type.
}
@defproc[(function-type? [t any/c]) bool?]{
Predicate for function types.
}
@defproc[(function-type-arg-type [t function-type?]) type?]{
Get the argument type.
Remember that you can't deconstruct type variables that are not fully constrained!
}
@defproc[(function-type-return-type [t function-type?]) type?]{
Get the return type.
Remember that you can't deconstruct type variables that are not fully constrained!
}


@defproc[(product-type [types (or/c (listof types?) #f)]) type?]{
Creates a product type (tuple).  If @racket[types] is @racket[#f], the length of the tuple is unspecified, and it can be @racket[unify!]-ed with a product type of any length.

Example:
@racketblock[
(define any-length (product-type #f))
(define l2 (product-type (list int int)))
(define l3 (product-type (list int int int)))

(can-unify? any-length l2) (code:comment "#t")
(can-unify? any-length l3) (code:comment "#t")
(unify! any-length l2)
(can-unify? any-length l2) (code:comment "#t")
(can-unify? any-length l3) (code:comment "#f")
]
}

@defproc[(product-type? [t any/c]) bool?]{
Predicate for product types.
}

@defproc[(product-type-inner-type-list [t product-type?]) any/c]{
Returns the list of types in the product-type, or @racket[#f] for a product-type with a length that is still unspecified.
}

@defform[(define-generic-type name (field-spec ...))
         #:grammar
         ([field-spec [field-name variance]
                      field-name]
          [variance invariant
                    covariant
                    contravariant])]{
Used to create generic types.

This form defines a constructor @racket[name], a predicate @racket[name]@verb{?}, and one accessor for each type-argument @racket[name]@verb{-}@racket[type-argument] a la @racket[struct].
If no variance is given for a field, it is invariant when subtyping.

Each instance of a generic type can be unified with other instances of the same generic.

Example:
@racketblock[
(define number (base-type 'number))
(define int (base-type 'int number))
(define-generic-type list-type ([type covariant]))

(define t1 (list-type number))
(generic-type? t1) (code:comment "#t")
(list-type? t1) (code:comment "#t")
(generic-type-type-arguments t1) (code:comment "(list number)")
(list-type-type t1) (code:comment "number")

(define t2 (list-type int))
(can-subtype-unify? t1 t2) (code:comment "#f")
(can-subtype-unify? t2 t1) (code:comment "#t")
(can-unify? t1 t2) (code:comment "#f")
]
}

@defproc[(generic-type? [t any/c]) bool?]{
Returns true when @racket[t] is a generic type.
Not very useful, since you probably want to know if it is an instance of a specific generic.
}

@defproc[(generic-type-name [t generic-type?]) symbol?]{
Returns the name of a generic type.
Remember that you can't deconstruct type variables that are not fully constrained!
}
@defproc[(generic-type-type-arguments [t generic-type?]) (listof type?)]{
Returns the inner types of a generic type as a list.
Remember that you can't deconstruct type variables that are not fully constrained!
}


@defproc[(nominal-record-type? [t any/c]) bool?]{
Predicate for nominal record types.

Partially specified @racket[nominal-record-type?]s are created with @racket[nominal-record-type-with].
Fully specified @racket[nominal-record-type?]s are created by using @racket[concretize-type] on a partially specified one.
Rather than making them manually, simply rely on Xsmith's definition-lifting mechanism to create appropriate fully-specified @racket[nominal-record-type?]s.

When a partially defined @racket[nominal-record-type] is @racket[unify!]-ed with a fully defined @racket[nominal-record-type], the partially defined one is mutated to become the same as the fully defined one.
When two partially defined @racket[nominal-record-type]s are unified together, an error is raised.

Every node in a language grammar that stands for a @racket[nominal-record-type] constructor, accessor, or mutator must include a reference to a @racket[nominal-record-definition-type] containing the @racket[nominal-record-type] being used.

The reason for this is that nominal types must be defined in the program.
@racket[nominal-record-definition-type]s are necessary because the lookups of these type names are a different type than uses of the record type that the name is bound to.

Example:
@racketblock[
(add-to-grammar my-grammar
 ...
 [VariableReference Expression (name)]
 ...
 [StructReference Expression (fieldname
                              [structdefref : VariableReference]
                              [structval : Expression])]
 ...
 )

(add-property
 my-grammar
 type-info
 ...
 [StructReference [(fresh-type-variable)
                   (λ (n t)
                     (define type-with-field
                       (nominal-record-type-with (ast-child 'fieldname n) t))
                     (hash 'structval type-with-field
                           'structdefref (nominal-record-definition-type
                                          type-with-field)))]]
 ...
 )
]
}

@defproc[(nominal-record-type-with [fieldname string?] [fieldtype type?])
type?]{
Creates a partially-specified @racket[nominal-record-type?].
Use it to specify that you want a record that contains a certain type.
}

@defproc[(any-nominal-record-type) type?]{
Creates a completely unconstrained @racket[nominal-record-type?].
It will unify with any fully constrained @racket[nominal-record-type?].
}

@defproc[(nominal-record-type-name [t nominal-record-type?]) any/c]{
Getter for the name of a @racket[nominal-record-type?].
}
@defproc[(nominal-record-type-known-field-dict [t nominal-record-type?]) dict?]{
Getter for the inner type dictionary.
If you use this on a partially-specified @racket[nominal-record-type?], you will get an incomplete dictionary.
}

@defproc[(nominal-record-definition-type [t nominal-record-type?]) type?]{
Constructor.
See note in @racket[nominal-record-type?] for how it is used.
}
@defproc[(nominal-record-definition-type? [t any/c]) bool?]{
Predicate for nominal record definition types constructed with @racket[nominal-record-definition-type].
}
@defproc[(nominal-record-definition-type-type [t nominal-record-definition-type?]) nominal-record-type?]{
Getter for the @racket[nominal-record-type] inside a @racket[nominal-record-definition-type].
}

@defproc[(structural-record-type? [v any/c]) bool/c]{
Predicate for structural record types.
}
@defproc[(fresh-structural-record-type
          [field-dict (hash/c symbol? type?) (hash)]
          [#:finalized? finalized? any/c #f])
         type?]{
Constructor for a structural record type.
If @racket[finalized?] is @racket[#f], the result is variable and may have fields added during unification.
}
@defproc[(structural-record-type-known-field-dict [srt structural-record-type?]) (hash/c symbol? type?)]{
Get the field/type mapping held by @racket[srt].

Because this may be updated by unification, and type exploration is lazy where possible, you should use @racket[force-type-exploration-for-node!] before using this on the type of any particular node.
}

@defproc[(force-type-exploration-for-node! [n ast-node?]) void/c]{
Type exploration is lazy, and is only done far enough for the built-in algorithm to check whether potential fresh nodes will have appropriate types for a given hole.

If you need to manually inspect details of types inside @racket[type-info], you should use this function on the node whose type is in question to be sure the returned type reflects a maximally unified view.

TODO - explain better when you need to use this.

}

@defproc[(settled-type? [t type?]) bool?]{
A settled type is a type that either has no variables (a la @racket[type-has-no-variables?]) or that has type variables that have been completely constrained.

In other words, all type variables contained in this type (including @racket[product-type]s, @racket[nominal-record-type]s, and @racket[structural-record-type]s) have been unified such that they only have one option, which is itself settled.
If you have a variable that is not settled and you need a settled one, you can use @racket[concretize-type] on the type you have.

See warnings in @racket[concretize-type] about its use!

Note that through @racket[subtype-unify!], variables may be effectively settled without passing the @racket[settled-type?] predicate.
Xsmith uses @racket[subtype-unify!] internally, so this could be an issue even if you don't manually use it.
When a variable is subtype-unified to be the subtype of a base type, say @tt{string}, the variable can unify with the range (@racket[#f], @tt{string}).
If @tt{string} has no subtypes, then @tt{string} is the only type it can be unified with.
However, the @racket[settled-type?] function doesn't currently have access to the list of subtypes (because currently they can be created dynamically), so it doesn't know that it's effectively settled.
}

@defproc[(type-has-no-variables? [t type?]) bool?]{
If this returns true, then @racket[t] is both a @racket[settled-type?] AND has no type variable wrappers (IE created at some point by @racket[fresh-type-variable]).
Note that it may have @racket[product-type]s, @racket[nominal-record-type]s, and @racket[structural-record-type]s, but only ones that have been settled.

Mostly this is just useful to tell whether you can confidently use projection functions (eg. @racket[product-type-inner-type-list]) without running into type variable wrappers.
}

@defproc[(concretize-type [t type?]) type?]{
Returns a type that @racket[can-unify?] with @racket[t], but that is both a @racket[settled-type?] and a @racket[type-has-no-variables?].
Note that it does @emph{not} @racket[unify!] or @racket[subtype-unify!] the returned type with the input type.

This function can be useful if you want to generate a random type or proactively make a decision about an unsettled type you have where you need a settled, concrete type.
But beware!  You should probably NOT generate random types or unify with concretized types unless you also store them in the grammar node that represents that type.
The type-checking code defined in the @racket[type-info] property can be flushed and re-run many times for each node, so a node that randomly chooses its type will not be stable.
Because the type algorithm imperatively unifies types, this causes mayhem.  Don't do it.

Note that to use this function you must pass a value in to the @racket[#:type-thunks] argument of @racket[define-xsmith-interface-functions].
}


@section{Miscellaneous Utilities}

@defproc[(fresh-int!) number?]{
Returns a unique integer.  The state of the generator is reset for each program generated, so that generation is reproducible.

Basically, use this rather than using your own unique number generator.
}
@defproc[(fresh-var-name [template string?]) string?]{
Returns a name created by appending a fresh integer to the end of @racket[template].

Example:
@racketblock[
(fresh-var-name "variable_") (code:comment "returns the string \"variable_123\", or something like it.")
]
}

@defproc[(enqueue-inter-choice-transform [thunk procedure?]) void?]{
You can't mutate the RACR tree during attribute evaluation.
But sometimes you really want to.
At any rate, if your attribute (perhaps defined by a property) needs to save some information in the tree for future iterations (such as a type rule that proactively concretizes a type), you can use this to make tree changes between the filling of one tree hole and another.

In other words, if your attribute needs to use @racket[rewrite-terminal], stick that computation in a thunk, use this on it, and it will be evaluated after the current round of attribute evaluation but before the next one.
}


@section{Debug Logging}
@defproc[(xd-printf [format-string string?] [args (listof any/c)] ...) any/c]{
Like @racket[printf], but it prints to a buffer that is output when an exception is raised during program generation.
}
@defproc[(datt-value [method symbol?] [node ast-node?] [arg any/c] ...) any/c]{
A wrapper for RACR's @racket[att-value] function that prints trace info using @racket[xd-printf].
}



@section{Turning Your Grammar Specification Into a Program}

Use the @racket[xsmith-define-interface-functions] macro.  This section used to have more before things were rearranged.  Maybe things should be rearranged more.

@defproc[(xsmith-feature-enabled? [feature symbol?]) boolean?]{
Returns the value set by the user via @racket[define-xsmith-interface-functions] for the given feature.
The feature name must have been supplied to the #:features argument of @racket[define-xsmith-interface-functions], or an error will be raised.
}

@defproc[(xsmith-max-depth) number?]{
Returns the maximum tree generation depth as set by the user via @racket[define-xsmith-interface-functions] or on the command line.
}

@section{RACR Convenience Functions}
@defmodule[xsmith/racr-convenience]

These are a group of convenience functions around @(racr).
They are not necessary, and some may not be the best choices.
But I have found them a little more convenient than using certain @(racr) APIs directly.

@defform[(expr->ast-list length-expression field-expression)]{
Creates an @racket[ast-list-node?] containing a list of length @racket[length-expression].
For each element of the list, @racket[field-expression] is evaluated again.
}

@defproc[(node-type [n any/c]) any/c]{
Returns the symbol of the type of n, or #f if n is not a proper non-bud, non-list @racket[ast-node?].

Wrapper for @racket[ast-node-type] that returns false rather than erroring when it gets bud nodes or list nodes...
}

@defproc[(parent-node [n any/c]) any/c]{
Wrapper for ast-parent that returns #f rather than erroring when the given node doesn't have a parent.
}

@defproc[(top-ancestor-node [n any/c]) any/c]{
Calls @racket[parent-node] until it reaches the last parent, and returns it.
}

@defproc[(node-subtype? [n any/c]) any/c]{
Wrapper for @racket[ast-subtype?] that returns #f rather than erroring when the given node is a bud, list, or non-node.
}

@section{xsmith/app}
@defmodule[xsmith/app]

The xsmith/app module provides a convenient @tt{#%app} replacement for accessing attributes (AKA methods, or attributes on AST nodes) and methods (AKA choice-methods) on choice objects.

See @secref["application" #:doc '(lib "scribblings/reference/reference.scrbl")] for more details on @tt{#%app}.

Note that these bindings are @italic{not} provided by the main @tt{xsmith} module.

@defform[(#%app form ...)]{
When the first form (after the [probably implicit] @racket[#%app] identifier) is a quoted symbol, the form is treated as a method application.

In short, if the node @tt{n} is an @racket[ast-node?], then:
@racketblock[($xsmith_type n)]
is essentially rewritten as:
@racketblock[(att-value 'xsmith_type n)]

Additionally, if @tt{n} is an @racket[object?] (probably a @racket[choice-object?]), then:
@racketblock[($choice-method-name n 1 2 3)]
is essentially rewritten as:
@racketblock[(send n choice-method-name 1 2 3)]

In practice, whether @tt{n} is an @racket[ast-node?] or @racket[object?] can't be determined statically, so it is tested at runtime.

If the first form is not a quoted symbol, then the @racket[racket/base:#%app] from is used.
}

@defform[(define-xsmith-app xsmith-app-name inner-app prefix)]{
Defines a macro like @racket[#%app] above, but using @racket[inner-app] as the fallback instead of @racket[racket/base:#%app].
Use this if you want to combine the xsmith/app behavior with another customized @tt{#%app} implementation.

Additionally, @racket[prefix] is used instead of @tt{$}.  @racket[prefix] is given as a symbol.

@racketblock[(define-xsmith-app my-xsmith-app #%plain-app $^!)]
}

@section[#:tag "canned-components"]{Canned Components}
@defmodule[xsmith/canned-components]

The abstract grammars for many languages are very similar.
We provide a library of canned components to get fuzzers up and running quickly for languages that include common patterns.
All uses of canned components should probably start with @racket[define-basic-spec-component].

Note that @racket[add-basic-expressions] and @racket[add-basic-statements] aren't good abstractions, they just facilitate some shared code of common components.  You ultimately still have to know the structure of all components added.

The canned-components also export various types that the canned grammar nodes use.
Of note, statements use two statement types: @racket[return-type] and @racket[no-return-type].
The expressions use all the other provided types.

For some examples that use these canned components, see the @tt{xsmith-examples/simple} directory.

Note that these bindings are @italic{not} provided by the main @tt{xsmith} module.

@defform[(define-basic-spec-component grammar-component)]{
Defines the @verb{grammar-component} using @racket[define-spec-component], then adds definitions for @verb{Expression}, @verb{Statement}, @verb{Definition}, @verb{DefinitionNoRhs}, and @verb{FormalParameter}.

The @verb{Expression} and @verb{Statement} nodes have no fields and cannot be generated by the grammar, but serve as scaffolding for other productions in @racket[add-basic-expressions], @racket[add-basic-statements], and your own definitions.

The @verb{Definition}, @verb{DefinitionNoRhs}, and @verb{FormalParameter} nodes all have both @verb{type} and @verb{name} fields.
The @verb{Definition} node also has an @verb{Expression} field.

@verb{DefinitionNoRhs} is meant to be used in forms that implicitly add a definition to subforms, but whose definition is not necessarily of the same type as a child the definition comes from.
For example, a loop form where you bind a variable to each element of a given list.
}

@defform[(add-basic-expressions grammar-component optional ...)
         #:grammar
         [(optional
           [code:line #:ProgramWithSequence boolean]
           [code:line #:VoidExpression boolean]
           [code:line #:AssignmentExpression boolean]
           [code:line #:IfExpression boolean]
           [code:line #:LambdaWithExpression boolean]
           [code:line #:LambdaWithBlock boolean]
           [code:line #:LetSequential boolean]
           [code:line #:ExpressionSequence boolean]
           [code:line #:Booleans boolean]
           [code:line #:Strings boolean]
           [code:line #:MutableArray boolean]
           [code:line #:MutableArraySafeAssignmentExpression boolean]
           [code:line #:ImmutableArray boolean]
           [code:line #:ImmutableList boolean]
           [code:line #:MutableStructuralRecord boolean]
           [code:line #:MutableStructuralRecordAssignmentExpression boolean]
           [code:line #:ImmutableStructuralRecord boolean]
           )]]{
Extends @racket[grammar-component] with an expression language.
All nodes added come with @racket[type-info] and other necessary properties specified, but they lack the @racket[render-node-info].
In other words, this form gives you everything but the pretty printing for these parts of your language.

TODO - maybe just show the add-to-grammar segments in the lists of what is added.
For now, you really just need to look at the source of canned-components to see what it is.  It's not a real abstraction, but I don't want to tediously document all the internal names right now.

The following top-level node types are always added to the grammar:

@itemlist[
@item{An abstract @tt{Expression} node (IE @racket[may-be-generated] is false).}
]

If @racket[#:ProgramWithSequence] is true,
@tt{(ProgramWithSequence #f ([definitions : Definition *] ExpressionSequence))}
is added.  Use this when your language is free from the nonsense of statements.

The following @tt{Expression} node types are always added:

TODO - document all of the field names

@itemlist[
@item{@tt{VariableReference} with @tt{name}}
@item{@tt{ProcedureApplication}}
@;@item{@tt{NumberLiteral} (abstract)}
@item{@tt{IntLiteral} -- uses @racket[int] type}
@item{@tt{Plus}}
@item{@tt{Minus}}
@item{@tt{Times}}
@item{@tt{SafeDivide}}
@item{@tt{LessThan}}
@item{@tt{GreaterThan}}
]

Other options mostly add a single node type with the same name as the option.  The exceptions are:

TODO


Type considerations:
@itemlist[
@item{Numeric operations provided operate on the @racket[number] type.}
@item{Arrays use either @racket[(mutable (array-type (fresh-type-variable)))] or @racket[(immutable (array-type (fresh-type-variable)))].}
@item{Structural records use either @racket[(mutable (fresh-structural-record-type))] or @racket[(immutable (fresh-structural-record-type))].}
@item{Lists use @racket[(immutable (list-type (fresh-type-variable)))]}
]
}


@defform[(add-basic-statements grammar-component optional ...)
         #:grammar
         [(optional
           [code:line #:ProgramWithBlock boolean]
           [code:line #:AssignmentStatement boolean]
           [code:line #:ExpressionStatement boolean]
           [code:line #:MutableArraySafeAssignmentStatement boolean]
           [code:line #:MutableStructuralRecordAssignmentStatement boolean]
           )]]{
Like @racket[add-basic-expressions], extends @racket[grammar-component] with a statement language, providing all necessary properties except @racket[render-node-info].
If you use @racket[add-basic-statements], you must use @racket[add-basic-expressions] as well.

TODO - maybe just show the add-to-grammar segments in the lists of what is added.
For now, you really just need to look at the source of canned-components to see what it is.  It's not a real abstraction, but I don't want to tediously document all the internal names right now.

The following node types are always added to the grammar:

@itemlist[
@item{An abstract @tt{Statement} node (IE @racket[may-be-generated] is false).}
@item{@tt{ReturnStatement} with @tt{Expression}}
@item{@tt{Block} (a Statement subtype) with @tt{definitions} and @tt{statements}}
@item{@tt{IfElseStatement} with @tt{test} @tt{then} @tt{else}}
]

The optional arguments add a node with the same name as the keyword.


Type considerations:
@itemlist[
@item{@tt{ReturnStatement} is of type @racket[(return-type (fresh-type-variable))]}
@item{@tt{Block} and @tt{IfElseStatement} are of type @racket[(fresh-maybe-return-type)]}
@item{@tt{AssignmentStatement}, @tt{ExpressionStatement}, and others are of type @racket[no-return-type]}
]
}

@defform[#:kind "spec-property" #:id block-user? block-user?]{
Property for nodes that use the Block node as a child to support definition children as well as a list of statements.
When given @racket[#t], the child Block won't increase the calculated AST depth.

@racketblock[
(add-basic-statements my-component
                      ...
                      #:Block #t
                      ...)
(add-to-grammar my-component
                [OneArmedIfStatement Statement
                                     ([test : Expression] [then : Block])
                                     #:prop block-user? #t])
]
}

@defform[(add-loop-over-container grammar-component kw-arg ...)
#:grammar
[(kw-arg
  [code:line #:name identifier]
  [code:line #:loop-ast-type identifier]
  [code:line #:body-ast-type identifier]
  [code:line #:bind-whole-collection? boolean]
  [code:line #:collection-type-constructor function]
  [code:line #:loop-type-constructor function]
  [code:line #:body-type-constructor function]
  [code:line #:loop-variable-type-constructor function]
  )]]{
Adds a looping form named @racket[#:name] to @racket[grammar-component].
The looping node will be of ast-type @racket[loop-ast-type], which defaults to @verb{Expression}.
The looping node will have 3 children:
@itemlist[
@item{@verb{collection} - if @racket[#:bind-whole-collection?] is @racket[#t], it will be of AST node type @verb{Definition}, otherwise it will be of AST node type @verb{Expression} (the default).  This represents the collection to be looped over, and the @racket[#:bind-whole-collection?] is a convenience to make references to the whole collection available.}
@item{@verb{elemname} - of AST node type @verb{DefinitionNoRhs} which itself has @verb{name} and @verb{type} fields but no @verb{Expression} field.  This node represents the binding to an element of the list in each iteration of the body.}
@item{@verb{body} - of AST node type @racket[body-ast-type], which defaults to @verb{Expression}.  Changing this is useful to make loops whose body is a @verb{Statement} or @verb{Block}.}
]

The @racket[#:collection-type-constructor] should be a function from the type inside the collection to the type of the collection.

The @racket[#:loop-type-constructor] should be a function from the type inside the collection to the type of the whole loop.
By default @racket[#:loop-type-constructor] is the same as @racket[#:collection-type-constructor], which corresponds to a loop that forms a comprehension form with a result (similar to Racket's @racket[for/list] form).
However, common values for @racket[#:loop-type-constructor] include @racket[(λ (elem-type) void-type)] for loop expressions that only side-effect, or @racket[(λ (elem-type) (fresh-maybe-return-type))] for loops in those silly statement languages.

The @racket[#:body-type-constructor] should be a function from the loop type and the element type to the type of the loop body.
By default @racket[#:body-type-constructor] returns the element type, but for side-effectful loops and/or statement-bodied loops it should be something else.
For example, a statement-bodied loop should have @racket[#:body-type-constructor (λ (loop-type elem-type) loop-type)].

The @racket[#:loop-variable-type-constructor] should be a function from the type inside the collection to the type of the loop variable.
By default this is the identity function.
However, you can override this to make a loop over a container where the loop variable is, say, an integer that indexes the collection rather than an element of the collection.
The details of how you make the bound name match the type are essentially up to the @racket[render-node-info] rules you write.


Most keyword arguments are optional, but @racket[#:name] is required.

Example:
@racketblock[
(add-loop-over-container
 python-comp
 (code:comment "Sure, Python calls them lists, but my type system calls them arrays.")
 #:name ArrayComprehension
 #:collection-type-constructor (λ (elem-type) (mutable (array-type elem-type))))
(add-property python-comp render-node-info
          [ArrayComprehension
           ;; [body for binder_name in collection]
           (λ (n) (h-append (text "[")
                            ($xsmith_render-node (ast-child 'body n))
                            (text " for ")
                            (text (ast-child 'name (ast-child 'elemname n)))
                            (text " in ")
                            ($xsmith_render-node (ast-child 'collection n))
                            (text "]")))])
]

}


@defproc[(return-type [t type?]) type?]{
A type used for statements in return position.

This is used to encode where a return statement is needed and what type it must be.
In other words, the block in a @tt{LambdaWithBlock} has @racket[return-type], the last statement in the block is unified to have the same type.

For the curious, it is implemented as a @racket[generic-type?] with a covariant inner type.
}

@defthing[no-return-type base-type?]{
A type for statements that don't return.  In other words, it's @racket[void-type] for statements.
}

@defproc[(fresh-maybe-return-type) type?]{
Use this for compound statements that can optionally contain a return statement.

IE @racket[(fresh-type-variable (return-type (fresh-type-variable)) no-return-type)].
}

@defthing[void-type base-type?]{
A void type.  Used by @tt{AssignmentExpression} and the like.
}
@defthing[number-type base-type?]{}
@defthing[int-type base-type?]{Subtype of @racket[number]}
@defthing[float-type base-type?]{Subtype of @racket[number]}
@defthing[bool-type base-type?]{}
@defthing[string-type base-type?]{}

@defproc[(mutable [t type?]) type?]{
Used for encoding mutable versions of types.
Eg. canned components provide @racket[(mutable (array-type (fresh-type-variable)))] and similar.

It is implemented as a @racket[generic-type?] with a covariant inner type.
}
@defproc[(immutable [t type?]) type?]{
Used for encoding immutable versions of types, much like @racket[mutable].
}

@defproc[(array-type [t type?]) type?]{
Constructor for array types.
Implemented as a @racket[generic-type?] with a covariant inner type.
The canned components only use this wrapped in @racket[mutable] or @racket[immutable].
}

@defproc[(list-type [t type?]) type?]{
Constructor for list types.
Implemented as a @racket[generic-type?] with a covariant inner type.
The canned components only use this wrapped in @racket[mutable] or @racket[immutable].
}


@;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

@; End of file.
